Adaptive classification of temporal signals in fixed-weights recurrent neural networks:
an existence proof

Ivan Y. Tyukin*, Danil Prokhorov†, Cees van Leeuwen‡



Abstract

We address the important theoretical question why a recurrent neural network with
fixed weights can adaptively classify time-varied signals in the presence of additive noise
and parametric perturbations. We provide a mathematical proof assuming that unknown
parameters are allowed to enter the signal nonlinearly and the noise amplitude is
sufficiently small.

Keywords: recurrent neural networks, adaptive classification, nonlinear parameterization 

1 Introduction

Recurrent Neural Networks (RNN) with fixed weights are known to be able to solve problems of
adaptive classification, recognition, and control (Prokhorov et al., 2002; Feldkamp et al., 1996;
Feldkamp & Puskorius, 1997; Younger et al., 1999; Lo, 2001). When the objects to be classified
are static, e.g. still images or vectors in $\mathbb{R}^n$, the way the fixed-weight RNN solves problems
is usually characterized in terms of convergence of the RNN state to an attractor (Hopfield,
1982; Fuchs & Haken, 1988). Each attractor corresponds to a specific class of objects and its
basin determines which objects belong to the class. Conditions specifying convergence to an
attractor are widely available in this case; see (Cohen & Grossberg, 1983; Michel et al., 1989; Yang
& Dillon, 1994; Chen & Amari, 2001; Lu & Chen, 2003), to name a few.

When the objects to be classified are dynamic, for instance nonlinearly parameterized
functions of time of which the parameters are unknown a priori, no adequate theory exists that
explains why the fixed-weight RNN approach is successful. At present, theoretical results are
available to demonstrate that a single fixed-weight RNN of a certain type can approximate the
solutions of multiple dynamical systems (Back & Chen, 2002). Hence in principle, a fixed-weight
RNN can behave adaptively with respect to changes of its input signals. These theoretical
results, however, are restricted to the class of parameter replacement networks (Chen & Chen,
1995). The structure of these networks differs from that of the more commonly used recurrent
multilayered perceptrons. Whether adaptive behavior is inherent to other types of RNN,
therefore, remains an unresolved theoretical issue. In spite of plausibility arguments given by several
authors (Feldkamp & Puskorius, 1997; Prokhorov et al., 2002), no formal proof has been made
available, to the best of our knowledge.

In this paper we consider adaptive behavior in fixed-weight RNNs from the standpoint of
their ability to classify temporal signals adaptively. We provide a formal proof that
continuous-time recurrent neural networks with fixed weights can successfully classify and recognize
nonlinear functions of time and of unknown parameters. These functions are allowed to be nonlinearly
parameterized. The main idea behind our results consists of presenting a prototype dynamical
system which solves the recognition problem. This is followed by a proof that an RNN with fixed
weights can realize this system. We construct such a system using the concepts of relaxation
times and weakly attracting sets (Milnor, 1985; Gorban, 2004) as well as the tests for
convergence to such sets obtained in our earlier work (Tyukin et al., 2007). To show that our system
can indeed be realized by an RNN with fixed weights we employ classical results on function
approximation by feed-forward networks (Cybenko, 1989).

The paper is organized as follows. Section 2 describes notational agreements. In Section 3 
we provide a mathematical statement of the problem. Section 4 contains the main results, and 
Section 5 concludes the paper. 

2 Notational Preliminaries 

• Symbol $\mathbb{R}$ defines the field of real numbers, and symbol $\mathbb{R}_{\geq c}$, $c \in \mathbb{R}$ stands for the following
set $\mathbb{R}_{\geq c} = \{x \in \mathbb{R} \mid x \geq c\}$, and $\mathbb{R}_{>c} = \{x \in \mathbb{R} \mid x > c\}$.

• Symbol $\mathbb{R}^n$ stands for an $n$-dimensional linear space over the field of reals.

• $\mathcal{C}^k$ denotes the space of functions that are at least $k$ times differentiable.

• Symbol $\mathcal{K}$ denotes the class of all strictly increasing functions $\kappa : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$ such that
$\kappa(0) = 0$; symbol $\mathcal{K}_\infty$ denotes the class of all functions $\kappa \in \mathcal{K}$ such that $\lim_{s\to\infty} \kappa(s) = \infty$.

• Symbol $\oplus$ denotes concatenation of two vectors.

• The solution of a system of differential equations $\dot{x} = f(t, x, \theta, u(t))$,
$f : \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^n$, $f \in \mathcal{C}^0$, $u : \mathbb{R}_{\geq 0} \to \mathbb{R}^m$, $\theta \in \mathbb{R}^d$,
passing through point $x_0$ at $t = t_0$ will be denoted for $t \geq t_0$
as $x(t, x_0, t_0, \theta, u)$, or simply as $x(t, x_0)$ or $x(t)$ if it is clear from the context what the values
of $x_0$, $\theta$ are and how the function $u(t)$ is defined.

• By $L^n_\infty[t_0, T]$, $t_0 \geq 0$, $T \geq t_0$ we denote the space of all functions $f : \mathbb{R}_{\geq 0} \to \mathbb{R}^n$ such that
$\|f\|_{\infty,[t_0,T]} = \mathrm{ess\,sup}\{\|f(t)\|,\ t \in [t_0, T]\} < \infty$; $\|f\|_{\infty,[t_0,T]}$ stands for the $L^n_\infty[t_0, T]$ norm of $f(t)$.

• Let $\mathcal{A}$ be a set in $\mathbb{R}^n$ and $\|\cdot\|$ be the usual Euclidean norm in $\mathbb{R}^n$. By the symbol $\|\cdot\|_{\mathcal{A}}$
we denote the following induced norm:

$$\|x\|_{\mathcal{A}} = \inf_{q \in \mathcal{A}} \{\|x - q\|\}$$

In case $x$ is a scalar and $\Delta \in \mathbb{R}_{>0}$, notation $\|x\|_\Delta$ stands for the following:

$$\|x\|_\Delta = \begin{cases} |x| - \Delta, & |x| > \Delta \\ 0, & |x| \leq \Delta \end{cases}$$


3 Problem Formulation 



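Consider the following set of signals

$$\mathcal{F} = \{f_i(\xi(t), \theta_i)\},\ i \in \{1, \dots, N_f\},\quad f_i : \mathbb{R} \times \mathbb{R} \to \mathbb{R},\ f_i(\cdot,\cdot) \in \mathcal{C}^0, \tag{1}$$

where $\theta_i \in \Omega_\theta \subset \mathbb{R}$ are parameters of which the values are unknown a priori, $\Omega_\theta = [\theta_{\min}, \theta_{\max}]$
is a bounded interval, and $\xi(t)$ is a known and bounded function. Signals $f_i(\xi(t), \theta_i)$ represent
relevant physical variables of an object.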

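For the given functions $f_i(\xi(t), \theta_i)$ and $\xi(t)$ we say that $\theta_i$ is equivalent to $\theta_i'$ iff

$$f_i(\xi(t), \theta_i) = f_i(\xi(t), \theta_i')\quad \forall\, t \in \mathbb{R}_{\geq 0}. \tag{2}$$

Hence an equivalence class for $\theta_i \in \Omega_\theta$ can be defined as

$$E_i(\theta_i) = \{\theta_i' \in \mathbb{R} \mid f_i(\xi(t), \theta_i) = f_i(\xi(t), \theta_i')\ \forall\, t \in \mathbb{R}_{\geq 0}\} \tag{3}$$

Equivalence classes (3) determine sets of indistinguishable parameterizations of the $i$-th signal.
It is natural, therefore, to restrict ourselves to the problem of recognizing signals (1) up to their
equivalence classes.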

With respect to the equivalence classes $E_i(\theta_i)$, we further assume that there is at least one
point $\theta_0 \in \mathbb{R}$ such that

$$\|\theta_0\|_{E_i(\theta_i)} \geq \Delta_\theta \in \mathbb{R}_{>0}\quad \forall\, \theta_i \in \Omega_\theta. \tag{4}$$

Requirement (4) is a technical assumption. It holds, however, for a wide range of practically
relevant situations in which the union of $E_i(\theta_i)$ for all $i$ and $\theta_i$ belongs to an interval of $\mathbb{R}$.
Furthermore, it allows us to exclude from consideration pathological cases in which almost all
points in $\Omega_\theta$ are indistinguishable in the sense of condition (2).

In many systems, artificial or natural, measured physical quantities, represented here by
signals $f_i(\xi(t), \theta_i)$, are often unavailable. This is because a measurement device is involved in
measuring $f_i(\xi(t), \theta_i)$. Given that signals $f_i(\xi(t), \theta_i)$ are functions of time, inherent dynamical
properties of a measurement device would distort the measured values. Our present study takes
this possibility into account. To do so we consider the case where signals $f_i(\xi(t), \theta_i)$ are affected
by additive bounded noise and pass through nonlinear filters with uncertain dynamics. In
particular, we assume that instead of functions $f_i(\xi(t), \theta_i)$ we access variables $s_i(t, s_{i,0}, \theta_i, \eta_i(t))$,
which are solutions to the following ordinary differential equation:

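$$\dot{s}_i = -\varphi_i(s_i) + f_i(\xi(t), \theta_i) + \eta_i(t). \tag{5}$$

In (5) the function $\eta_i : \mathbb{R}_{\geq 0} \to \mathbb{R}$,

$$\eta_i \in L_\infty[0, \infty],\quad \|\eta_i(t)\|_{\infty,[0,\infty]} \leq \Delta_\eta \in \mathbb{R}_{\geq 0} \tag{6}$$

corresponds to measurement noise. The value of $\Delta_\eta$ in (6) is supposed to be known, while the
values of initial conditions $s_i(t_0)$ and functions $\varphi_i : \mathbb{R} \to \mathbb{R}$, $\varphi_i(\cdot) \in \mathcal{C}^1$ in (5) are assumed to be
uncertain. We do, however, require that the set of initial conditions $\Omega_s = [s_{\min}, s_{\max}]$ is an interval
and that the functions $\varphi_i(s_i)$ satisfy the following constraint:

$$\forall\, s_i \in \mathbb{R} \Rightarrow \varphi_{\min} \leq \frac{\partial \varphi_i(s_i)}{\partial s_i} \leq \varphi_{\max},\quad \varphi_{\min}, \varphi_{\max} \in \mathbb{R}_{>0}. \tag{7}$$

Condition (7) ensures that filters (5) are convergent (Pavlov, 2004), i.e. the dynamics of each
variable $s_i(t, s_{i,0}, \theta_i, \eta_i(t))$ as $t \to \infty$ is uniquely determined in the absence of noise by $f_i(\xi(t), \theta_i)$,
and the effects of initial conditions vanish with time asymptotically.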
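The convergence property of filter (5) can be illustrated numerically. The following is a minimal sketch under stated assumptions, not part of the original argument: the choices $\varphi_i(s) = 2s$, $\xi(t) = \sin t$, $f_i(\xi, \theta) = \sin(\xi\theta)$, zero noise, and the forward-Euler scheme are all hypothetical and serve only to show two solutions with different initial conditions merging.

```python
import math

def simulate_filter(s0, theta, phi_slope=2.0, t_end=10.0, dt=1e-3):
    """Forward-Euler integration of (5) with eta_i = 0 (illustrative choices only)."""
    s, t = s0, 0.0
    for _ in range(int(round(t_end / dt))):
        xi = math.sin(t)                  # hypothetical known bounded input xi(t)
        f = math.sin(xi * theta)          # hypothetical signal f_i(xi, theta)
        s += dt * (-phi_slope * s + f)    # phi_i(s) = phi_slope * s satisfies (7)
        t += dt
    return s

# Same theta, different initial conditions: the gap contracts like exp(-phi_slope * t).
gap = abs(simulate_filter(-1.0, 0.7) - simulate_filter(1.0, 0.7))
```

Because both runs share the same forcing term, the gap between them obeys a purely linear contraction, which is why it becomes negligible well before `t_end`.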

A recurrent neural network is defined by the following set of differential equations:

$$\dot{x}_j = \sum_{m=1}^{N} c_{j,m}\, \sigma\!\left(w_{j,m}^T (x \oplus s \oplus \xi) + b_{j,m}\right),\quad j \in \{1, \dots, N_x\}, \tag{8}$$

$$x = \mathrm{col}(x_1, \dots, x_{N_x}),\quad x(t_0) = x_0,$$

where functions $\sigma : \mathbb{R} \to \mathbb{R}$ are sigmoid. Vectors $c_j = \mathrm{col}(c_{j,1}, \dots, c_{j,N})$, $b_j = \mathrm{col}(b_{j,1}, \dots, b_{j,N})$
and matrices $W_j = (w_{j,1}, \dots, w_{j,N})$ are the RNN parameters. Functions $\xi(t), s(t) : \mathbb{R}_{\geq 0} \to \mathbb{R}$,
$\xi(t), s(t) \in \mathcal{C}^0$ are inputs; $x$ is the state vector, and $x_0$ is a vector of initial conditions.
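To make the structure of (8) concrete, the right-hand side can be sketched in a few lines of NumPy. The array layout, the use of `tanh` as the sigmoid, and the random parameter values below are assumptions made for illustration; the paper itself does not fix them.

```python
import numpy as np

def rnn_rhs(x, s, xi, C, W, b):
    """Right-hand side of (8).

    C, b have shape (N_x, N); W has shape (N_x, N, N_x + 2), so that
    W[j, m] plays the role of the weight vector w_{j,m}.
    """
    z = np.concatenate([x, [s, xi]])        # concatenation x (+) s (+) xi
    pre = W @ z + b                          # pre-activations, shape (N_x, N)
    return np.sum(C * np.tanh(pre), axis=1)  # tanh is an assumed sigmoid choice

rng = np.random.default_rng(0)
N_x, N = 3, 5
C = rng.normal(size=(N_x, N))
W = rng.normal(size=(N_x, N, N_x + 2))
b = rng.normal(size=(N_x, N))
dx = rnn_rhs(np.zeros(N_x), s=0.3, xi=-0.1, C=C, W=W, b=b)
```

Since the sigmoid is bounded, each component of the vector field is bounded by the sum of the absolute values of the corresponding output weights, regardless of the state and inputs.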

According to notation (8) the network maps two functions of time $\xi(t)$, $s(t)$ into the functions
$x_1(t, x_0), \dots, x_{N_x}(t, x_0)$, which are the solutions of (8). In what follows we will consider variables
$\xi(t)$, $s(t)$ as inputs to the network. While the variable $\xi(t)$ is known a priori, variable $s(t)$ is
allowed to vary within the set of functions $s_i(t, s_{i,0}, \theta_i, \eta_i(t))$, which are the solutions of (5). In
particular, we assume that the following condition is satisfied:

Assumption 1 (Existence) There exist $i \in \{1, \dots, N_f\}$, $\theta_i \in \Omega_\theta$, $s_{i,0} \in \Omega_s$ and $\eta_i(t)$ specified by (6)
such that

$$s(t) = s_i(t, s_{i,0}, \theta_i, \eta_i(t))\quad \forall\, t \geq 0. \tag{9}$$

We aim to determine if there is a network of type (8) which is able to recover uncertain
parameters $i$ and $\theta_i$ from the input $s(t)$^1, $t \geq t_0 \in \mathbb{R}_{\geq 0}$, within a finite interval of time for all
$t_0 \in \mathbb{R}_{\geq 0}$. Informally, this means that there exist two sets of functions of network state $x$ and
input $s(t)$:

$$\{h_{f,j}(x(t), s(t))\},\quad \{h_{\theta,j}(x(t), s(t))\}, \tag{10}$$

$$h_{f,j} : \mathbb{R}^{N_x} \times \mathbb{R} \to \mathbb{R},\quad h_{\theta,j} : \mathbb{R}^{N_x} \times \mathbb{R} \to \mathbb{R},\quad j \in \{1, \dots, N_f\},$$

such that the values of $i$ and $\theta_i$ can be inferred from $\{h_{f,j}(x(t), s(t))\}$, $\{h_{\theta,j}(x(t), s(t))\}$
respectively within a given finite interval of time. Formally we can state this as follows:

Problem 1 Consider class $\mathcal{F}$ of signals (1), where the function $\xi(t)$ is known, and the values
of parameters $\theta_i$ are unknown a priori. Determine a recurrent neural network (8) such that the
following properties hold:

1) there is a set of initial conditions $\Omega_x$ such that $x(t, x_0)$ is bounded for all $x_0 \in \Omega_x$ and
$t \geq t_0 \in \mathbb{R}_{\geq 0}$; the volume of $\Omega_x$ is nonzero;

2) there exists a set of output functions (10) such that, for all $\theta_i \in \Omega_\theta$, $s_{i,0} \in \Omega_s$, $t_0 \in \mathbb{R}_{\geq 0}$,
$x_0 \in \Omega_x$, and functions $\eta_i(t)$ given by (6), condition (9) implies existence of a constant $T \in \mathbb{R}_{>0}$,
time instant $t' \in (t_0, t_0 + T)$, (arbitrarily large) $T^* \in \mathbb{R}_{>0}$, and (arbitrarily small) $\varepsilon \in \mathbb{R}_{>0}$ and
$\nu \in \mathcal{K}_\infty$ such that

$$\|h_{f,i}(x(t), s(t))\|_{\infty,[t', t'+T^*]} \leq \varepsilon + \nu(\Delta_\eta),$$

$$\inf_{\theta_i' \in E_i(\theta_i)} \|h_{\theta,i}(x(t), s(t)) - \theta_i'\|_{\infty,[t', t'+T^*]} \leq \varepsilon + \nu(\Delta_\eta).$$

In general, this problem has no solutions for all possible $\xi(t)$ and $f_i(\cdot, \cdot) \in \mathcal{C}^0$. Consider,
for instance, the case when $f_i(\xi(t), \theta_i) = \sin(\xi(t)\theta_i)$ and

$$\xi(t) = \begin{cases} \sin^2(\ln(t - t_0 + 1)), & \sin(\ln(t - t_0 + 1)) > 0 \\ 0, & \sin(\ln(t - t_0 + 1)) \leq 0 \end{cases}$$

Time intervals when $\xi(t) = 0$ are growing unboundedly with time. Hence for any fixed $T$, $T^*$
there will always exist an instant $t_0'$ such that for all $t \geq t_0'$ lengths of intervals when $\xi(t) = 0$
exceed $T + T^*$. For all such intervals, solutions $s_i(t, s_{i,0}, \theta_i, \eta_i(t))$ do not depend on $\theta_i$. Hence
recovery of the actual values of $\theta_i$ from signal $s(t)$ cannot be achieved within a fixed time interval
$[t_0, t_0 + T + T^*]$ for all $t_0 \geq t_0'$.

^1 Because filters (5) are convergent, the effect of uncertainty in parameter $s_{i,0}$ vanishes with time exponentially.
Hence the only effective uncertainties are $i$ and $\theta_i$.

In order to enable a solution of the classification/recognition
problem above we must introduce an additional constraint on the functions $f_i(\xi(t), \theta_i)$. This
should ensure that variation in parameter $\theta_i$ can be detected from the values of $f_i(\xi(t), \theta_i)$
within a finite time interval. We therefore require that the following property holds:

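Assumption 2 (Non-degeneracy) For the set of functions $f_i(\xi(t), \theta_i)$ specified by (1) and all
$t \geq t_0$, $\theta_i$, $\theta_i'$ there exist a constant $T \in \mathbb{R}_{>0}$ and a strictly increasing function $\rho : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$,
$\rho \in \mathcal{K}_\infty$ such that the following condition holds:

$$\forall\, t \geq t_0\ \exists\, t' \in [t, t + T] :\ |f_i(\xi(t'), \theta_i) - f_i(\xi(t'), \theta_i')| \geq \rho\left(\|\theta_i\|_{E_i(\theta_i')}\right). \tag{11}$$

In case the equivalence classes $E_i(\theta_i')$ consist of single elements, i.e. when there is a unique
value of $\theta_i' = \theta_i$ satisfying (2), condition (11) will have a more transparent form:

$$\forall\, t \geq t_0\ \exists\, t' \in [t, t + T] :\ |f_i(\xi(t'), \theta_i) - f_i(\xi(t'), \theta_i')| \geq \rho\left(|\theta_i - \theta_i'|\right). \tag{12}$$

These conditions simply state that within a fixed time interval the values of $\|\theta_i\|_{E_i(\theta_i')}$ or $|\theta_i - \theta_i'|$
can be inferred from the differences $f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta_i')$ for all $t \in \mathbb{R}_{\geq 0}$.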
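The contrast between a non-degenerate input and the degenerate $\xi(t)$ constructed earlier in this section can be checked numerically. The sketch below assumes a hypothetical signal $f_i(\xi, \theta) = \theta\xi$, for which $|f_i(\xi(t'), \theta) - f_i(\xi(t'), \theta')| = |\xi(t')|\,|\theta - \theta'|$, so (12) reduces to a lower bound on the maximum of $|\xi|$ over every window of length $T$. The helper names and the grid resolution are illustrative choices.

```python
import math

def window_max(xi, t_start, T, n=2000):
    """Maximum of |xi(t')| over t' in [t_start, t_start + T], sampled on a uniform grid."""
    return max(abs(xi(t_start + T * k / n)) for k in range(n + 1))

def xi_good(t):
    return math.sin(t)                     # persistently exciting input

def xi_bad(t, t0=0.0):
    u = math.sin(math.log(t - t0 + 1.0))   # degenerate input from the counterexample
    return u * u if u > 0.0 else 0.0

T = math.pi
# For xi_good every window of length pi contains a point where |sin| is close to 1.
good = min(window_max(xi_good, 0.5 * k, T) for k in range(200))

# xi_bad vanishes whenever t - t0 + 1 lies in [exp((2k-1)pi), exp(2k pi)];
# the lengths of these intervals grow geometrically with k.
zero_lengths = [math.exp(2 * k * math.pi) - math.exp((2 * k - 1) * math.pi)
                for k in range(1, 5)]
bad = window_max(xi_bad, math.exp(math.pi), T)   # window placed inside a zero interval
```

With `xi_good`, condition (12) holds with a linear $\rho$; with `xi_bad`, any fixed window length is eventually shorter than a zero interval, so no $\rho \in \mathcal{K}_\infty$ can work.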

In the next section we show that the solution to Problem 1 can be obtained for the class $\mathcal{F}$
of functions $f_i(\xi(t), \theta_i)$ that are Lipschitz in $\theta_i$. We present these results in the form of sufficient
conditions formulated in Theorem 1.

4 Main Results 

As was suggested in our previous work (Prokhorov et al., 2002), as well as in (Younger et al.,
1999), the reason why RNNs with fixed parameters (weights) demonstrate adaptive behavior
could be found in their dynamics; supposedly, it is already sufficiently rich to have an adequate
adaptation mechanism embedded into it. Finding a system which satisfies requirements 1), 2)
in Problem 1 and which is, at the same time, realizable by an RNN therefore
constitutes an existence proof. This intuition, we will show, is correct. The result is provided
in Theorem 1 below.

Theorem 1 (Existence) Let functions $\xi(t)$, $f_i(\xi(t), \theta_i)$ be given and defined as in (1), and
Assumptions 1, 2 hold. Furthermore, suppose that $f_i(\xi(t), \theta_i)$ are (locally) Lipschitz^2:

$$\exists\, D_\theta \in \mathbb{R}_{>0} :\ |f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta_i')| \leq D_\theta \|\theta_i\|_{E_i(\theta_i')}\quad \forall\, t \geq 0,\ \theta_i, \theta_i' \tag{13}$$

$$\exists\, D_\xi \in \mathbb{R}_{>0} :\ |f_i(\xi, \theta_i) - f_i(\xi', \theta_i)| \leq D_\xi |\xi - \xi'|\quad \forall\, \xi, \xi', \theta_i \tag{14}$$

and the time-derivative of $\xi(t)$ is bounded:

$$|\dot{\xi}(t)| \leq \partial\xi_\infty\quad \forall\, t \geq 0. \tag{15}$$

Then for any $T^* \in \mathbb{R}_{>0}$, $\varepsilon \in \mathbb{R}_{>0}$ there is a recurrent neural network (8) satisfying the
requirements of Problem 1, provided that the upper bound for the $L_\infty[0, \infty]$-norms of the
disturbance terms, $\eta_i(t)$, is sufficiently small.

^2 Property (13) can be understood as a generalized Lipschitz condition. When equivalence sets $E_i(\theta_i')$ consist
of single elements the property transforms into: $|f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta_i')| \leq D_\theta |\theta_i - \theta_i'|$.

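Proof of Theorem 1. We prove the theorem in four steps. First, we present a dynamical
system which will be referred to as the convergence prototype. We select this system in the
following class of differential-algebraic equations:

$$\dot{\hat{s}}_i = -\varphi_i(\hat{s}_i) + f_i(\xi(t), \hat{\theta}_i) \tag{16}$$

$$\hat{\theta}_i = a + \frac{b - a}{2}(x_i + 1) \tag{17}$$

$$\begin{aligned} \dot{x}_i &= \gamma \|\hat{s}_i - s\|_\varepsilon \left(x_i - y_i - x_i(x_i^2 + y_i^2)\right) \\ \dot{y}_i &= \gamma \|\hat{s}_i - s\|_\varepsilon \left(x_i + y_i - y_i(x_i^2 + y_i^2)\right), \end{aligned} \tag{18}$$

where

$$\gamma \in \mathbb{R}_{>0},\ a, b \in \mathbb{R},\ a < \theta_{\min},\ b > \theta_{\max},\ \hat{\theta}_i \in [a, b],\ i = 1, \dots, N_f,\ \varepsilon \in \mathbb{R}_{>0}. \tag{19}$$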

System (16)-(18) has a locally Lipschitz right-hand side and its solutions are bounded for all
initial conditions $\hat{s}_i(t_0)$, $x_i(t_0)$, $y_i(t_0) \in \mathbb{R}$. We show that there exist (domains of) $\gamma > 0$, $\varepsilon > 0$
and a point $\hat{s}_i(t_0) = \hat{s}_0$, $x_i(t_0) = x_0$, $y_i(t_0) = y_0$, such that the trajectories passing through this
point converge to the following target set

$$\|\hat{s}_i(t) - s(t)\|_\varepsilon = 0,\quad \|\hat{\theta}_i(t)\|_{E_i(\theta_i)} \leq \varepsilon_\theta(\varepsilon). \tag{20}$$

Second, we prove that there is a point $x_i(t_0) = x_0'$, $y_i(t_0) = y_0'$ such that convergence is locally
uniform with respect to the values of uncertain $\theta_i$ and $s_{i,0}$. In other words, for all $t_0 \geq 0$,
$s_{i,0} \in \Omega_s$ and $\theta_i \in \Omega_\theta$ there exists $\tau > 0$ such that solutions of (16)-(18) with initial conditions
$x_i(t_0) = x_0'$, $y_i(t_0) = y_0'$ will be in an arbitrarily small neighborhood of (20) for all $t \geq t_0 + \tau$.
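The mechanism behind the convergence prototype (16)-(18) can be explored with a small simulation. The sketch below is illustrative only and rests on assumptions that are not in the paper: $\varphi_i(s) = s$, $f_i(\xi, \theta) = \theta\xi$, $\xi(t) = 1 + 0.5\sin t$, no noise, forward-Euler integration, and hand-picked values of $\gamma$, $\varepsilon$, $a$, $b$. The point $(x_i, y_i)$ rotates along the unit circle at a speed proportional to the dead-zone error and slows down as $\hat{\theta}_i$ approaches the true parameter.

```python
import math

def dead_zone(e, eps):
    """||e||_eps: zero inside [-eps, eps], |e| - eps outside."""
    return max(abs(e) - eps, 0.0)

def simulate_prototype(theta=1.0, a=0.0, b=2.0, gamma=0.1, eps=0.02,
                       t_end=300.0, dt=0.01):
    """Euler simulation of (5) and (16)-(18) under the illustrative assumptions above."""
    s = s_hat = 0.0
    x, y = 1.0, 0.0                      # start on the unit circle, cf. (27)
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        xi = 1.0 + 0.5 * math.sin(t)     # hypothetical known input
        theta_hat = a + 0.5 * (b - a) * (x + 1.0)      # (17)
        ds = -s + theta * xi                           # measured signal, (5)
        ds_hat = -s_hat + theta_hat * xi               # prototype filter, (16)
        rate = gamma * dead_zone(s_hat - s, eps)       # common factor in (18)
        r2 = x * x + y * y
        dx = rate * (x - y - x * r2)
        dy = rate * (x + y - y * r2)
        s += dt * ds
        s_hat += dt * ds_hat
        x += dt * dx
        y += dt * dy
        t += dt
    return a + 0.5 * (b - a) * (x + 1.0), math.hypot(x, y)

theta_hat_final, radius = simulate_prototype()
```

In this setting the estimate settles close to the true parameter, with a residual offset governed by the dead-zone width, and the state stays near the unit circle. Larger values of `gamma` may cause the estimate to pass the target and make further rotations before settling, which is consistent with the role of the bound on $\gamma$ in the proof.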

System (16)-(18), however, is not structurally stable. That is, small perturbations of its 
right-hand side might change asymptotic properties of the system drastically. Hence, due to 
the inevitable approximation errors, the chances that an RNN realization of (16)-(18) would 
solve Problem 1 are slim. To continue our argument we need to modify (16)-(18) such that the 
resulting system becomes structurally stable. 
For this reason we, third, consider the perturbed version of system (16)-(18)

$$\dot{\hat{s}}_i = -\varphi_i(\hat{s}_i) + f_i(\xi(t), \hat{\theta}_i),\quad \hat{\theta}_i = a + \frac{b - a}{2}(x_i + 1) \tag{21}$$

$$\begin{aligned} \dot{x}_i &= \gamma (\|\hat{s}_i - s\|_\varepsilon + \delta)\left(x_i - y_i - x_i(x_i^2 + y_i^2)\right) \\ \dot{y}_i &= \gamma (\|\hat{s}_i - s\|_\varepsilon + \delta)\left(x_i + y_i - y_i(x_i^2 + y_i^2)\right),\quad \delta \in \mathbb{R}_{>0} \end{aligned} \tag{22}$$

aiming at achieving structural stability of an otherwise structurally unstable system. We show
that trajectories of system (21), (22) periodically visit a small vicinity of (20) and stay there
for an arbitrarily long time, depending on the value of $\delta$. Fourth, given that system (21), (22) is
structurally stable, we apply the results from (Cybenko, 1989) to demonstrate that solutions
of (21), (22) can be approximated in forward time over the semi-infinite interval $[0, \infty]$ by the
state of a recurrent neural network specified by equations (8).

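1. Convergence prototype. According to Assumption 1 there exist $i \in \{1, \dots, N_f\}$, $\theta_i$, $s_{i,0}$
such that $s(t) = s_i(t, s_{i,0}, \theta_i, \eta_i(t))$ for all $t \geq 0$. Consider the $i$-th subsystem of (16)-(18) and
analyze the dynamics of the following difference: $s_i(t) - \hat{s}_i(t)$. Denoting

$$e_i(t) = s(t) - \hat{s}_i(t) = s_i(t) - \hat{s}_i(t),$$

$$\Delta f_i(t) = f_i(\xi(t), \theta_i) - f_i(\xi(t), \hat{\theta}_i(t)) \tag{23}$$

and using Hadamard's lemma, which gives $\varphi_i(s_i) - \varphi_i(\hat{s}_i) = \alpha(t) e_i$ with $\varphi_{\min} \leq \alpha(t) \leq \varphi_{\max}$
in view of (7), we can derive the following estimate:

$$|e_i(t)| \leq e^{-\int_0^t \alpha(\tau) d\tau} |e_i(0)| + \frac{1}{\varphi_{\min}} \left(1 - e^{-\int_0^t \alpha(\tau) d\tau}\right) \left(\|\Delta f_i(\tau)\|_{\infty,[0,t]} + \|\eta_i(\tau)\|_{\infty,[0,\infty]}\right) \tag{24}$$

Given that $\|\eta_i(\tau)\|_{\infty,[0,\infty]} \leq \Delta_\eta$ for all $t \geq 0$, inequality (24) implies that

$$\left(|e_i(t)| - \frac{\Delta_\eta}{\varphi_{\min}}\right) \leq e^{-\varphi_{\min} t} \left(|e_i(0)| - \frac{\Delta_\eta}{\varphi_{\min}}\right) + \frac{1}{\varphi_{\min}} \|\Delta f_i(\tau)\|_{\infty,[0,t]}$$

Hence the following estimate holds along the trajectories of (16):

$$\|e_i(t)\|_\varepsilon \leq e^{-\varphi_{\min} t} \|e_i(0)\|_\varepsilon + \frac{1}{\varphi_{\min}} \|\Delta f_i(\tau)\|_{\infty,[0,t]},\quad \varepsilon = \frac{\Delta_\eta}{\varphi_{\min}} \tag{25}$$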



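Taking (13), (25) into account plus the fact that
$\|\hat{\theta}_i\|_{E_i(\theta_i)} = \inf_{\theta_i' \in E_i(\theta_i)} |\hat{\theta}_i - \theta_i'|$ we can conclude
that the following inequality holds:

$$\|e_i(t)\|_\varepsilon \leq e^{-\varphi_{\min} t} \|e_i(0)\|_\varepsilon + \frac{D_\theta}{\varphi_{\min}} \|\theta_i' - \hat{\theta}_i(\tau)\|_{\infty,[0,t]},\quad \theta_i' \in E_i(\theta_i) \cap [a, b]. \tag{26}$$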

Let us now consider equations (17), (18). We pick a point $x'$, $y'$ which satisfies the
following condition:

$$x'^2 + y'^2 = 1. \tag{27}$$

Solutions of (18) passing through this point can be defined as follows:

$$\begin{aligned} x_i(t, x', y') &= \cos\left(\int_0^t \gamma \|\hat{s}_i(\tau) - s(\tau)\|_\varepsilon d\tau + \nu_x\right),\quad x' = \cos(\nu_x),\ \nu_x \in [0, 2\pi] \\ y_i(t, x', y') &= \sin\left(\int_0^t \gamma \|\hat{s}_i(\tau) - s(\tau)\|_\varepsilon d\tau + \nu_y\right),\quad y' = \sin(\nu_y),\ \nu_y \in [0, 2\pi] \end{aligned} \tag{28}$$

This can be easily verified when writing (18) in the system of polar coordinates: $x_i = r\cos(\nu)$,
$y_i = r\sin(\nu)$ (Guckenheimer & Holmes, 2002):

$$\dot{r} = \gamma \|\hat{s}_i - s\|_\varepsilon \cdot r(1 - r^2),\quad \dot{\nu} = \gamma \|\hat{s}_i - s\|_\varepsilon \tag{29}$$

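Given that $\theta_i$ belongs to the interval $[a, b]$, there is a number $h(\theta_i) \in [0, \pi]$ such that for all
$k \in \mathbb{Z}$ the following equivalence holds

$$\theta_i = a + \frac{b - a}{2}\left(\cos(h(\theta_i) + 2\pi k) + 1\right). \tag{30}$$

Hence according to (17), (28) the norm $\|\theta_i - \hat{\theta}_i(\tau)\|_{\infty,[0,t]}$ can be estimated from above as follows:

$$\|\theta_i - \hat{\theta}_i(\tau)\|_{\infty,[0,t]} \leq \frac{b - a}{2} \left\| h(\theta_i) - \nu_x + 2\pi k - \int_0^\tau \gamma \|\hat{s}_i(\tau_1) - s(\tau_1)\|_\varepsilon d\tau_1 \right\|_{\infty,[0,t]} \tag{31}$$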
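The representation (30) and the bound (31) both rest on elementary properties of the cosine: it maps $[0, \pi]$ onto $[-1, 1]$ and is 1-Lipschitz. A short sketch makes this explicit; the function names `theta_of_phase` and `phase_of_theta` and the numerical values are hypothetical.

```python
import math

def theta_of_phase(nu, a, b):
    """Parameter value associated with phase nu, as in (17) with x_i = cos(nu)."""
    return a + 0.5 * (b - a) * (math.cos(nu) + 1.0)

def phase_of_theta(theta, a, b):
    """A number h(theta) in [0, pi] solving (30) with k = 0."""
    return math.acos(2.0 * (theta - a) / (b - a) - 1.0)

a, b = -1.0, 3.0
theta = 0.4
h = phase_of_theta(theta, a, b)
roundtrip = theta_of_phase(h + 2.0 * math.pi * 3, a, b)   # any integer k gives the same theta

# Lipschitz bound used in (31): |theta(nu1) - theta(nu2)| <= (b - a)/2 * |nu1 - nu2|
pairs = [(0.1 * i, 0.1 * i + 0.37) for i in range(60)]
lipschitz_ok = all(
    abs(theta_of_phase(n1, a, b) - theta_of_phase(n2, a, b))
    <= 0.5 * (b - a) * abs(n1 - n2) + 1e-12
    for n1, n2 in pairs
)
```

The freedom in choosing $k$ is what later allows the initial value $h(0, \theta_i, k)$ to be made as large as required.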

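Denoting

$$c = \frac{D_\theta (b - a)}{2\varphi_{\min}};\quad h(t, \theta_i, k) = h(\theta_i) - \nu_x + 2\pi k - \int_0^t \gamma \|\hat{s}_i(\tau) - s(\tau)\|_\varepsilon d\tau$$

and taking into account (26), (31) we can conclude that the following holds along the solutions
of (16)-(18):

$$\begin{aligned} \|e_i(t)\|_\varepsilon &\leq e^{-\varphi_{\min} t} \|e_i(0)\|_\varepsilon + c \|h(\tau, \theta_i, k)\|_{\infty,[0,t]}; \\ h(0, \theta_i, k) - h(t, \theta_i, k) &= \int_0^t \gamma \|e_i(\tau)\|_\varepsilon d\tau \end{aligned} \tag{32}$$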



According to (Tyukin et al., 2007) (Theorem 1 and Corollaries 2, 3) there exist $\gamma^* \in \mathbb{R}_{>0}$ and
$h^*$ such that for a given bounded $e_i(0)$, all $\gamma \in \mathbb{R}_{>0}$, $\gamma \leq \gamma^*$ and $h(0, \theta_i, k) \geq h^*$ the norm
$\|e_i(\tau)\|_{\infty,[0,\infty]}$ is bounded and

$$\lim_{t \to \infty} h(t, \theta_i, k) \in [0, h(0, \theta_i, k)]. \tag{33}$$

The value of $\gamma^*$, according to Corollary 3 in (Tyukin et al., 2007), can be determined from the
following inequality



The value of h* can be estimated from: 



|e.(io) II. < ( fin 5) " ^ - c f 2 + ) ) r (35) 



■y* \ dJ K \ ^ — dy 

Given that $\|e_i(t_0)\|_\varepsilon$ in (35) is bounded from above for all $t_0 \geq 0$, $\|e_i(t_0)\|_\varepsilon \leq s_{\max} - s_{\min} +
(D_\theta / \varphi_{\min})(b - a)$, condition

..,(,„„_,„,.),i^)(^(,„«)-'^i^_.(.,_^))-' (3a, 

together with (34) imply that for all $\hat{s}_i(t_0) \in \Omega_s$ and $h(0, \theta_i, k) \geq h^*$ the norm $\|e_i(\tau)\|_{\infty,[0,\infty]}$ is
bounded and property (33) holds.

Notice that in the definition of $h(0, \theta_i, k)$:

$$h(0, \theta_i, k) = h(\theta_i) - \nu_x + 2\pi k \tag{37}$$

the value of $k$ can be chosen arbitrarily large. Moreover, $h(\theta_i) \in [0, \pi]$ for all $\theta_i \in [a, b]$. This
implies that there exists a finite $k'$ such that condition $h(0, \theta_i, k') \geq h^*$ will be satisfied for any
fixed $h^*$ (i.e. for all $\gamma^*$ satisfying (34)) and all $\theta_i \in [a, b]$. In addition, the following will hold:

$$\lim_{t \to \infty} h(t, \theta_i, k') \in [0, h(0, \theta_i, k')] \subset [0, \pi - \nu_x + 2\pi k']\quad \forall\, \theta_i \in [a, b]. \tag{38}$$

Taking (28) into account we can conclude that solutions $x_i(t, x', y')$ converge to a point in
the interval $[-1, 1]$ as $t \to \infty$, and vector $(x_i(t, x', y'), y_i(t, x', y'))$ makes no more than $k'$ full
rotations around the origin for all $\theta_i \in [\theta_{\min}, \theta_{\max}]$. Hence for a given initial condition $x_i(0) = x'$,
$y_i(0) = y'$, $s_{i,0} \in \Omega_s$ and $\theta_i \in [\theta_{\min}, \theta_{\max}]$ the estimate $\hat{\theta}_i(t) = a + (b - a)/2 \cdot (x_i(t, x', y') + 1)$
converges to a point in $[a, b]$ as $t \to \infty$. We denote this point by symbol $\theta^*$.

Given that $\hat{\theta}_i(t)$ converges to a limit, there exists a time instant $t^*$ such that for all $t \geq t^*$ the
following condition holds: $|\hat{\theta}_i(t) - \theta^*| \leq \mu_\infty$, where $\mu_\infty \in \mathbb{R}_{>0}$ is an arbitrarily small constant.
Therefore, taking condition (13) into account, we can conclude that for all $t \geq t^*$ derivative
$\dot{e}_i$ satisfies the following equation:

$$\dot{e}_i = -\alpha(t) e_i + f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta^*) + \mu_i(t) + \eta_i(t) \tag{39}$$

where $|\mu_i(t)| \leq D_\theta \mu_\infty$ is a continuous function.

Now we will show that the norm $\|\theta^*\|_{E_i(\theta_i)}$ can be bounded from above by a $\mathcal{K}_\infty$-function
of $\Delta_\eta$. Consider the term $f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta^*)$. According to (11) there exists a sequence
of monotonically increasing time instances $t_j$, $j = 1, 2, \dots$ such that $t_{j+1} - t_j \leq 2T$ and
$|f_i(\xi(t_j), \theta_i) - f_i(\xi(t_j), \theta^*)| \geq \rho(\|\theta_i\|_{E_i(\theta^*)})$. Furthermore, according to (14), (15), the
time-derivative of $f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta^*)$ is bounded:

$$\left| \frac{d}{dt}\left(f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta^*)\right) \right| \leq 2 D_\xi \cdot \partial\xi_\infty = D_f$$
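Hence the following estimate holds:

$$\int_t^{t+L} |f_i(\xi(\tau), \theta_i) - f_i(\xi(\tau), \theta^*)|\, d\tau \geq \frac{\rho^2(\|\theta_i\|_{E_i(\theta^*)})}{2 D_f},\quad L = 2T + \frac{\rho(\|\theta_i\|_{E_i(\theta^*)})}{D_f} \tag{40}$$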

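In order to proceed further we will need the following lemma.

Lemma 1 Consider the following differential equation

$$\dot{z} = -\varphi(t, z) + u(t) + \eta(t),\quad z_0 = z(0) \in [z_{\min}, z_{\max}] \subset \mathbb{R} \tag{41}$$

Let us suppose that

1) $\varphi(t, z) z \geq 0$, $\varphi_{\min} \leq \partial\varphi(t, z)/\partial z \leq \varphi_{\max}$;

2) $u(t) \in L_\infty[0, \infty] \cap \mathcal{C}^1$, $\|u(t)\|_{\infty,[0,\infty]} \leq u_\infty$, $\|\dot{u}(t)\|_{\infty,[0,\infty]} \leq \partial u_\infty$;

3) $\eta(t) \in L_\infty[0, \infty]$, $\|\eta(t)\|_{\infty,[0,\infty]} \leq \Delta$;

4) there exist constants $L$, $\delta$ such that for all $t \geq 0$

$$\int_t^{t+L} |u(\tau)| d\tau \geq \delta \tag{42}$$

5) finally assume that the following inequality holds

$$\frac{\delta^2}{L} - \Delta u_\infty L > 0. \tag{43}$$

Then for any $p \in \mathbb{R}_{>0}$ there exist constants $L^* > 0$ and $\delta^* \geq (\delta^2/L - \Delta u_\infty L)/p$, such that

$$\int_t^{t+L^*} |z(\tau)| d\tau \geq \delta^* \geq \frac{1}{p}\left(\frac{\delta^2}{L} - \Delta u_\infty L\right)\quad \forall\, t \geq 0 \tag{44}$$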

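Proof of Lemma 1. We prove the lemma along the lines of an argument provided in (Loria
et al., 2003) (Property 1). Consider the time-derivative of $zu$:

$$\frac{d}{dt}(zu) = (-\varphi(t, z) + u + \eta) u + z\dot{u} \geq u^2 - |z|(\varphi_{\max} + \partial u_\infty) - |u| \Delta \tag{45}$$

According to (45) for all $t, t_0 \in \mathbb{R}_{\geq 0}$, $t \geq t_0$ the following inequality holds:

$$z(t)u(t) - z(t_0)u(t_0) \geq \int_{t_0}^t u^2(\tau) d\tau - (\varphi_{\max} + \partial u_\infty) \int_{t_0}^t |z(\tau)| d\tau - \Delta \int_{t_0}^t |u(\tau)| d\tau$$

Rearranging terms yields

$$(\varphi_{\max} + \partial u_\infty) \int_{t_0}^t |z(\tau)| d\tau \geq z(t_0)u(t_0) - z(t)u(t) + \int_{t_0}^t u^2(\tau) d\tau - \Delta \int_{t_0}^t |u(\tau)| d\tau \tag{46}$$

Notice that $z(t_0)u(t_0) - z(t)u(t)$ is bounded from below for all $t \geq 0$. We denote this bound by
symbol $M$. Furthermore, according to the Hölder inequality and property (42), the following
estimate holds for all $t \geq 0$:

$$\int_t^{t+L} u^2(\tau) d\tau \geq \frac{1}{L}\left(\int_t^{t+L} |u(\tau)| d\tau\right)^2 \geq \frac{\delta^2}{L}$$

Hence for all time instances $t$: $(n + 1)L \geq t - t_0 \geq nL$, where $n$ is a positive integer, we have

$$\begin{aligned} (\varphi_{\max} + \partial u_\infty) \int_{t_0}^t |z(\tau)| d\tau &\geq M + n\frac{\delta^2}{L} - \Delta \int_{t_0}^t |u(\tau)| d\tau \\ &\geq M + n\frac{\delta^2}{L} - (n + 1) \Delta u_\infty L = (M - \Delta u_\infty L) + n\left(\frac{\delta^2}{L} - \Delta u_\infty L\right) \end{aligned} \tag{47}$$

According to the requirements of the lemma, inequality (43), the difference $\delta^2/L - \Delta u_\infty L > 0$
is a positive constant. Therefore, there exists $n = n'$ such that the right-hand side of (47)
exceeds some $\delta' = (\delta^2/L - \Delta u_\infty L)/p' \in \mathbb{R}_{>0}$, $p' \in \mathbb{R}_{>0}$. Choosing $t' = \min\{t : t - t_0 \geq n'L\}$ we
can conclude that

$$(\varphi_{\max} + \partial u_\infty) \int_{t_0}^{t'} |z(\tau)| d\tau \geq \delta' \tag{48}$$

Given that we could choose the value of $t_0$ arbitrarily in the domain $\mathbb{R}_{\geq 0}$, inequality (48) is
equivalent to

$$\int_t^{t+L^*} |z(\tau)| d\tau \geq \delta^*,$$

where $L^* = t' - t_0$, $\delta^* = \delta'/(\varphi_{\max} + \partial u_\infty) = (\delta^2/L - \Delta u_\infty L)/p$, $p = p'(\varphi_{\max} + \partial u_\infty)$. The
lemma is proven.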
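The statement of Lemma 1 can be illustrated on a simple instance. The sketch below is a numerical illustration under assumptions chosen for convenience, not a substitute for the proof: $\varphi(t, z) = z$, $u(t) = \sin t$, a constant disturbance $\eta(t) = 0.05$, and forward-Euler integration. For this $u$ the integral of $|u|$ over any window of length $2\pi$ equals $4$, so (42) holds with $L = 2\pi$ and $\delta = 4$.

```python
import math

L = 2.0 * math.pi      # window length in (42)
delta = 4.0            # integral of |sin| over any window of length 2*pi
noise = 0.05           # constant disturbance, so Delta = 0.05
u_inf = 1.0            # upper bound on |u(t)|
margin = delta ** 2 / L - noise * u_inf * L   # left-hand side of (43)

def min_window_integral(t_end=40.0, dt=1e-3):
    """Smallest value of the integral of |z| over sliding windows of length L."""
    steps = int(round(t_end / dt))
    n_win = int(round(L / dt))
    z, t = 0.0, 0.0
    prefix = [0.0]                               # running integral of |z|
    for _ in range(steps):
        prefix.append(prefix[-1] + abs(z) * dt)
        z += dt * (-z + math.sin(t) + noise)     # equation (41) with phi(t, z) = z
        t += dt
    return min(prefix[i + n_win] - prefix[i]
               for i in range(0, steps - n_win + 1, 100))

lowest = min_window_integral()
```

When `margin` is positive, as it is here, the windowed integral of $|z|$ stays bounded away from zero, which is the persistency property the lemma transfers from the input $u$ to the state $z$.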

Denoting $f_i(\xi(t), \theta_i) - f_i(\xi(t), \theta^*) = u(t)$, $\eta_i(t) + \mu_i(t) = \eta(t)$ we can observe that equation
(39) is of the same class as (41) in the formulation of Lemma 1. Furthermore, the following
inequalities hold:

$$\Delta \leq \Delta_\eta + D_\theta \mu_\infty;\quad \|u(t)\|_{\infty,[0,\infty]} \leq D_\theta \|\theta_i\|_{E_i(\theta^*)} \leq D_\theta (b - a) \tag{49}$$

Notice that the value of $\mu_\infty$ in (49) can be made arbitrarily small because $\hat{\theta}_i(t)$ converges to a
limit, and $\theta^*$ can be chosen from its arbitrarily small vicinity. Let us therefore choose $\theta^*$ such
that $D_\theta \mu_\infty \leq \Delta_\eta$. Hence, in accordance with Lemma 1, condition (50)
implies existence of constants $L^*$, $p \in \mathbb{R}_{>0}$ such that



e,(r)MT > ^ IP}I^^M.\ i. _ Ax^^L | = <5* > W > i*. (51) 
We are going to show now that the norm $\|\theta^*\|_{E_i(\theta_i)}$ is bounded from above by a function
$\varepsilon_\theta(\Delta_\eta) \in \mathcal{K}_\infty$ for all sufficiently small $\Delta_\eta$. Let us parameterize $\varepsilon^*$ as follows:

Parametrization (52) is always possible because $\rho(\cdot) \in \mathcal{K}_\infty$. For all $\|\theta_i\|_{E_i(\theta^*)} \geq \varepsilon^*$ condition
(50) is satisfied. Hence, according to Lemma 1 there exist constants $L^*$, $p$ such that inequality
(51) holds. Given that $\delta^*$, $L^*$, $\varphi_{\min} \in \mathbb{R}_{>0}$ there will always exist a number $\Delta^* \in \mathbb{R}_{>0}$ such
that $\Delta^* \leq (L^*)^{-1} \delta^* \varphi_{\min}/2$. This implies that for all $\Delta_\eta \leq \Delta^*$ the following inequality holds



¥i{r)UT>'-, 8 = ^. (53) 



Let us suppose that the norm $\|\theta_i\|_{E_i(\theta^*)}$ is greater than $\varepsilon^*$. In this case (50), (53) hold and
the integral

$$\int_0^t \|e_i(\tau)\|_\varepsilon d\tau \tag{54}$$

grows unboundedly with $t$. On the other hand, according to (32), (33) integral (54) is bounded.
Hence we have reached a contradiction. This implies that $\|\theta_i\|_{E_i(\theta^*)} \leq \varepsilon^*$. Given that $\rho(\cdot) \in \mathcal{K}_\infty$,
the inverse $\rho^{-1}(\cdot)$ is well defined and is a $\mathcal{K}_\infty$-function. Therefore, taking (52) into account, we
can conclude that the latter inequality is equivalent to:

P^\\E,iet) < P~' {{8\Mb - a)D)L'y") (55) 

Thus we have just shown that there exists a point $x'$, $y'$ in system (16)-(18), and parameters
$\gamma$ and $\varepsilon$ such that the system trajectories starting from this point converge into a small
neighborhood of $E_i(\theta_i)$ in finite time for all $s_{i,0} \in \Omega_s$ and any given $\theta_i \in [\theta_{\min}, \theta_{\max}]$. The size of this
neighborhood can be characterized by a $\mathcal{K}_\infty$-function of $\Delta_\eta$, when $\Delta_\eta$ is sufficiently small. Let
us now show that this convergence is uniform with respect to $\theta_i$.

2. Uniformity. Consider equation (38). According to (32), (38) trajectories passing through
a point $(x', y')$ satisfying (27) at $t = 0$ also satisfy the following constraint:

$$\exists\, k' \in \mathbb{Z} :\ h(0) - h(\infty) = \gamma \int_0^\infty \|e_i(\tau, e_i(0), \theta_i, \eta_i(\tau))\|_\varepsilon d\tau \leq \pi - \nu_x + 2\pi k' < \infty \tag{56}$$

for all $\theta_i \in [\theta_{\min}, \theta_{\max}]$ and $e_i(0)$. We will use this property to demonstrate that there is a point
$(x', y')$, $\sqrt{x'^2 + y'^2} = 1$, $\|\hat{\theta}_i(x')\|_{E_i(\theta_i)} \geq \Delta_0$, $\Delta_0 \in \mathbb{R}_{>0}$, such that for any $\theta_i \in [\theta_{\min}, \theta_{\max}]$ the
estimate $\hat{\theta}_i(x_i(t, x', y'))$ converges into a set
mh^^e,) < ((8A,D,(6 - a)D}L'Y^') (57) 
in finite time $T'(\theta_i)$ for all $s_{i,0} \in \Omega_s$, and stays there for all $t \geq t_0 + T'(\theta_i)$. Furthermore,
the value of $T'(\theta_i)$ is bounded from above for all $\theta_i \in [\theta_{\min}, \theta_{\max}]$. In other words, there exists
$T_{\max}$ such that

$$T'(\theta_i) \leq T_{\max}\quad \forall\, \theta_i \in [\theta_{\min}, \theta_{\max}]. \tag{58}$$

The fact that estimate $\hat{\theta}_i$ converges into a set specified by (57) in finite time $T'(\theta_i)$ and stays
there for $t \geq t_0 + T'(\theta_i)$ for all $x'$, $y'$ : $x'^2 + y'^2 = 1$ follows immediately from (55). We must
show, however, that (58) holds.

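According to (4), (19) there is a point $\theta_0 \in [a, b]$ such that $\|\theta_0\|_{E_i(\theta_i)} \geq \Delta_\theta$ for every $\theta_i \in \Omega_\theta$.
Hence, there exists a point $\theta_{i,1} \in [a, b]$ such that

$$\inf_{\theta_i' \in E_i(\theta_i) \cap [a, b]} |\theta_i' - \theta_{i,1}| = \Delta_\theta$$

Without loss of generality, suppose that the set $\Omega_1 = \{\theta_i' \in E_i(\theta_i) \cap [a, b] \mid \theta_{i,1} > \theta_i'\}$ is not empty^3.
By symbol $\theta_{i,\max}$ we denote $\theta_{i,\max} = \sup\{\Omega_1\}$. Let us pick a point $\theta_{i,2} \in [a, b]$ according to the
following constraints

$$\theta_{i,1} > \theta_{i,2} > \theta_{i,\max} \tag{59}$$

and choose the value of $\nu_x$ in (28) such that

$$\theta_{i,2} = a + \frac{b - a}{2}(\cos(\nu_x) + 1),\quad \nu_x \in [0, \pi].$$

According to (30) there exist $h(\theta_{i,\max})$, $k$ such that

$$\theta_{i,\max} = a + \frac{b - a}{2}\left(\cos(h(\theta_{i,\max}) + 2\pi k) + 1\right),\quad h(\theta_{i,\max}) \in [0, \pi],\ k \in \mathbb{N}.$$

Given that $\theta_{i,2} > \theta_{i,\max}$ we set the value of $k = 0$ and choose $h(\theta_{i,\max})$ in accordance with the
following inequality:

$$\nu_x < h(\theta_{i,\max}). \tag{60}$$

Because $|\hat{\theta}_i(\cos(\nu_x)) - \hat{\theta}_i(\cos(\nu_x'))| \leq \frac{b - a}{2}|\nu_x - \nu_x'|$ for all $\nu_x, \nu_x' \in \mathbb{R}$, conditions (59), (60) ensure
existence of a constant $\nu_x' < h(\theta_{i,\max})$, $\nu_x' = \nu_x + \Delta_\theta/(2(b - a))$ such that

$$|\hat{\theta}_i(\cos(\nu_x)) - \hat{\theta}_i(\cos(\nu_x''))| \leq \Delta_\theta/4\quad \forall\, \nu_x'' \in [\nu_x, \nu_x']. \tag{61}$$

Hence,

$$\|\hat{\theta}_i(\cos(\nu_x''))\|_{E_i(\theta_i)} \geq \frac{\Delta_\theta}{4}\quad \forall\, \nu_x'' \in [\nu_x, \nu_x'].$$

^3 If $\Omega_1$ is empty then $\Omega_2 = \{\theta_i' \in E_i(\theta_i) \cap [a, b] \mid \theta_{i,1} < \theta_i'\}$ is not empty. We can proceed with the same
argument replacing interval $[0, \pi]$ with $[\pi, 2\pi]$ and sup with inf when appropriate.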
The inequality above implies that the values of $\hat{\theta}_i(\cos(\nu_x''))$ are outside of the $\Delta_\theta/4$-neighborhood
of $E_i(\theta_i)$ for all $\nu_x'' \in [\nu_x, \nu_x']$. Furthermore, because $\hat{\theta}_i(\cos(\cdot))$ is monotone (non-increasing)
over $[\nu_x, h(\theta_{i,\max}))$, and $\theta_{i,2} > \theta_{i,\max}$, there are no values of $\nu_x'' \in [\nu_x, h(\theta_{i,\max}))$ such that
$\|\hat{\theta}_i(\cos(\nu_x''))\|_{E_i(\theta_i)} = 0$.

Let us consider solutions of system (16)-(18) passing through the following point $x_i(0) =
\cos(\nu_x)$, $y_i(0) = \sin(\nu_x)$, $\hat{s}_i(0) \in \Omega_s$. Suppose that $0 < \gamma < \gamma^*$, and $\gamma^*$ satisfies (36) with
$h^* = \Delta_\theta/(2(b - a))$. Then, according to (Tyukin et al., 2007) the sum $\nu_x + \gamma \int_0^t \|e_i(\tau)\|_\varepsilon d\tau$
converges to a point in $[\nu_x, h(\theta_{i,\max})]$. Taking monotonicity and continuity of function $\hat{\theta}_i(\cos(\nu_x''))$
for $\nu_x'' \in [\nu_x, h(\theta_{i,\max})]$ into account, we can conclude that trajectory $\hat{\theta}_i(x_i(t, x'(\theta_i)))$ enters the
$\varepsilon^*$-neighborhood of $\theta_{i,\max}$ only once for all $t \in [0, \infty]$.

Let us show that the amount of time required for the system to enter this neighborhood is
bounded from above for all $\theta_i \in \Omega_\theta$. Given that trajectory $\hat{\theta}_i(x_i(t, x', y'))$ enters the
$\varepsilon^*$-neighborhood of $\theta_{i,\max}$ only once, we shall show that the amount of time the system spends
outside of this neighborhood is bounded from above for all $\theta_i \in \Omega_\theta$. We prove this by
contradiction. Suppose that for any fixed $T_0 \in \mathbb{R}_{>0}$ there is a $\theta_i \in [\theta_{\min}, \theta_{\max}]$ such that $T'(\theta_i) > T_0$.
Consider dynamics of (16)-(18) when $s(t) = s_i(t, s_{i,0}, \theta_i, \eta_i(t))$. Let us pick a sequence of time
instances $\{t_j\}_{j=1}^\infty$, such that $t_{j+1} - t_j = D_t$, and $D_t \geq L^*$. For each interval $[t_j, t_{j+1}]$ we consider
two possibilities:

1) the norm $\|\hat{\theta}_i(t_j) - \hat{\theta}_i(\tau)\|_{\infty,[t_j,t_{j+1}]} \leq \epsilon$, $\epsilon \in \mathbb{R}_{>0}$, $\epsilon \leq D_\theta^{-1} \Delta_\eta$, and

2) the norm $\|\hat{\theta}_i(t_j) - \hat{\theta}_i(\tau)\|_{\infty,[t_j,t_{j+1}]} > \epsilon$.

In case the first alternative applies, according to (53) the following estimate holds:
$\int_{t_j}^{t_{j+1}} \|e_i(\tau)\|_\varepsilon d\tau \geq \delta^*$. Hence $h(t_j) - h(t_{j+1}) \geq \gamma\delta^*$. When the second alternative holds, i.e.
$\|\hat{\theta}_i(t_j) - \hat{\theta}_i(\tau)\|_{\infty,[t_j,t_{j+1}]} > \epsilon$, we can conclude, using inequality (31), that

$$\left\| \gamma \int_{t_j}^{\tau} \|e_i(\tau_1)\|_\varepsilon d\tau_1 \right\|_{\infty,[t_j,t_{j+1}]} > \epsilon \frac{2}{b - a}.$$

Given that $h(t)$ is monotone with respect to $t$ we obtain that $h(t_j) - h(t_{j+1}) > \epsilon\, 2/(b - a)$. Thus
we have shown that

$$h(t_j) - h(t_{j+1}) \geq \min\{\gamma\delta^*,\ \epsilon\, 2/(b - a)\} = \Delta_h$$

for all $j$ such that $\|\hat{\theta}_i(\tau)\|_{E_i(\theta_i)} > \varepsilon^*$ for all $\tau \in [t_j, t_{j+1}]$. Given that $h(t)$ is non-increasing and
$T'$ is arbitrarily large, there will be a time instance $t_m < T'$ such that $\sum_{j=0}^m \left(h(t_j) - h(t_{j+1})\right) \geq
m\Delta_h > \pi - \nu_x + 2\pi k'$. This, however, contradicts (56). Hence property (58) is proven.

3. Structurally stable prototype. So far we have shown that for the given system (16)-(18)
there exists a non-empty set of parameters $\gamma$, $\varepsilon$, and $x'$, $y'$: $\sqrt{x'^2 + y'^2} = 1$ such that trajectories
$x_i(t,x',y')$, $y_i(t,x',y')$ converge to a point on the unit circle in $\mathbb{R}^2$, and variable $\theta_i(x_i(t,x',y'))$
reaches a given small vicinity of $E_i(\theta_i)$ (see (57)) within finite time $T'_{\max}$ for all $\theta_i \in [\theta_{\min}, \theta_{\max}]$.

Let us now consider perturbed system (21), (22) where $\delta \in \mathbb{R}_{>0}$ and initial conditions are
selected in a neighborhood of $x'$, $y'$:

$$(x_i(0), y_i(0)) \in \Omega(x',y') = \{(x,y) \in \mathbb{R}^2 \mid (x - x')^2 + (y - y')^2 < \delta_r^2\}, \quad \delta_r \in \mathbb{R}_{>0}. \tag{62}$$

In order to distinguish solutions of (21), (22) from the solutions of unperturbed system (16)-
(18), we denote the latter by symbols $x_i^*(t, x_i(0), y_i(0))$, $y_i^*(t, x_i(0), y_i(0))$, and $s_i^*(t, \theta_i, s_{i,0}, \eta_i(t))$.
For the sake of notational compactness we also denote the state vector of the $i$-th subsystem
of (16)-(18) as $\mathbf{q}_i^* = (s_i^*, x_i^*, y_i^*)$, and the state vector of the $i$-th subsystem of (21), (22) as $\mathbf{q}_i$.
Solutions of (21), (22) are bounded:

||si(^,Si,o,^i(^))||oo,[o,oo] < |sj,o| + (max{|a|, |6|}L>e + A^)/</?min, 
$\|x_i(t,x_i(0),y_i(0))\|_{\infty,[0,\infty]} \le \max\{1, \sqrt{x_i(0)^2 + y_i(0)^2}\}$, (63)
$\|y_i(t,x_i(0),y_i(0))\|_{\infty,[0,\infty]} \le \max\{1, \sqrt{x_i(0)^2 + y_i(0)^2}\}$.

Hence for all $s_i(0)$, $x_i(0)$, $y_i(0) \in \Omega_s \times \Omega(x', y')$ there exists a constant $D_0$ such that $\|\mathbf{q}_i(t)\|_{\infty,[0,\infty]} \le D_0$
for all $\theta_i$. Let us rewrite (21), (22) as follows:

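$$\begin{aligned}
\dot{s}_i &= -\varphi_i(s_i) + f_i(\xi(t), \theta_i(x_i)), \\
\dot{x}_i &= \gamma \|s_i - s\|_\varepsilon \left(x_i - y_i - x_i(x_i^2 + y_i^2)\right) + \gamma\delta \cdot \varepsilon_x(x_i, y_i), \\
\dot{y}_i &= \gamma \|s_i - s\|_\varepsilon \left(x_i + y_i - y_i(x_i^2 + y_i^2)\right) + \gamma\delta \cdot \varepsilon_y(x_i, y_i),
\end{aligned} \tag{64}$$

where

$$\begin{aligned}
\varepsilon_x(x_i(t), y_i(t)) &= x_i(t) - y_i(t) - x_i(t)\left(x_i^2(t) + y_i^2(t)\right), \\
\varepsilon_y(x_i(t), y_i(t)) &= x_i(t) + y_i(t) - y_i(t)\left(x_i^2(t) + y_i^2(t)\right).
\end{aligned}$$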

The right-hand side of (16)-(18) is locally Lipschitz in $s_i$, $x_i$, $y_i$ (and so is the right-hand side
of (21), (22)). We denote its corresponding Lipschitz constant in the domain specified by (63)
by symbol $L_f(D_0)$. Furthermore, provided that (63) holds, $\varepsilon_x(x_i(t),y_i(t))$, $\varepsilon_y(x_i(t),y_i(t))$ are
globally bounded with respect to $t$. Let us denote this bound by symbol $B$:

$$\max\{\|\varepsilon_x(x_i(t),y_i(t))\|_{\infty,[0,\infty]}, \|\varepsilon_y(x_i(t),y_i(t))\|_{\infty,[0,\infty]}\} = B.$$

For the sake of notational compactness let us rewrite (64) as follows:

$$\dot{\mathbf{q}}_i = \mathbf{f}(\mathbf{q}_i, s(t), \xi(t)) + \gamma\delta \cdot \mathbf{g}(\mathbf{q}_i), \tag{65}$$

where $\mathbf{f}(\mathbf{q}_i, s(t), \xi(t))$ and $\mathbf{g}(\mathbf{q}_i)$ are defined to copy the right-hand side of (64). Notice that
$\|\mathbf{f}(\mathbf{q}_i, s(t), \xi(t))\| \le L_f(D_0)\|\mathbf{q}_i\|$, $\|\mathbf{g}(\mathbf{q}_i)\| \le \sqrt{2} B$.

According to the theorem on continuous dependence of solutions of an ODE on parameters
and initial conditions (see, for instance, (Khalil, 2002), Theorem 3.4, page 96) the following
holds:

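$$\|\mathbf{q}_i(t) - \mathbf{q}_i^*(t)\| \le \|\mathbf{q}_i(t_0) - \mathbf{q}_i^*(t_0)\| e^{L_f(D_0)(t-t_0)} + \frac{\sqrt{2}\gamma\delta B}{L_f(D_0)}\left(e^{L_f(D_0)(t-t_0)} - 1\right). \tag{66}$$

When the values of $s_{i,0}$ and $s^*_{i,0}$ coincide, estimate (66) implies that

$$\|\mathbf{q}_i(t) - \mathbf{q}_i^*(t)\| \le \delta_r e^{L_f(D_0)(t-t_0)} + \frac{\sqrt{2}\gamma\delta B}{L_f(D_0)}\left(e^{L_f(D_0)(t-t_0)} - 1\right). \tag{67}$$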

This assures existence of $\delta_r \in \mathbb{R}_{>0}$, $\delta \in \mathbb{R}_{>0}$ such that for a fixed, yet arbitrarily large, time
$T''(\delta_r, \delta) > T'_{\max}$ solutions of system (21), (22) passing through a point from $\Omega(x',y')$ at $t = t_0$
will remain within a fixed, yet arbitrarily small, neighborhood of a solution of system (16)-(18)
with initial conditions $x_i(t_0) = x'$, $y_i(t_0) = y'$. The value of $T'_{\max}$ does not depend on $\delta_r$, $\delta$.
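To make the trade-off between the perturbation size and the length of the time window concrete, here is a minimal numerical sketch. It assumes a Gronwall-type estimate of the general form used in (66), (67), namely deviation $\le \delta_r e^{LT} + (\mu/L)(e^{LT} - 1)$ with elapsed time $T = t - t_0$. The names `lipschitz` (standing for $L$), `mu` (a bound on the perturbation magnitude, proportional to $\gamma\delta$ in the text) and `tol` (the admissible neighborhood size) are illustrative assumptions, not quantities fixed by the paper.

```python
import math


def deviation_bound(elapsed, delta_r, mu, lipschitz):
    """Right-hand side of the assumed Gronwall-type estimate after time `elapsed`."""
    growth = math.exp(lipschitz * elapsed)
    return delta_r * growth + (mu / lipschitz) * (growth - 1.0)


def horizon(tol, delta_r, mu, lipschitz):
    """Largest elapsed time for which deviation_bound(...) stays <= tol.

    Obtained by solving (delta_r + mu/L) * exp(L*T) - mu/L = tol for T.
    Returns 0.0 when the initial mismatch already exceeds the tolerance.
    """
    if tol <= delta_r:
        return 0.0
    ratio = (tol + mu / lipschitz) / (delta_r + mu / lipschitz)
    return math.log(ratio) / lipschitz


if __name__ == "__main__":
    # Hypothetical numbers: shrinking delta_r and mu lengthens the guaranteed window.
    for scale in (1e-2, 1e-3, 1e-4):
        print(scale, horizon(tol=0.1, delta_r=scale, mu=scale, lipschitz=2.0))
```

The guaranteed window grows only logarithmically as `delta_r` and `mu` shrink, which matches the qualitative statement that $T''$ can be made arbitrarily large but only by making both quantities small.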

Taking (29) into account, we can conclude that the set $x_i^2 + y_i^2 = 1$ is globally attracting in
the state space of system (21), (22) for almost all initial conditions (except when $x_i(t_0) = 0$,
$y_i(t_0) = 0$). This implies that solutions starting in $\Omega(x',y')$ will remain there. In addition,
according to (28), for any $t_0$ a $\delta_r$-vicinity of $(x',y')$ will be visited at some time
$t' < t_0 + 2\pi/(\gamma \cdot \delta)$, since the angular velocity along the circle is at least $\gamma\delta$.
Hence we have just shown that for all $t_0 > 0$ solutions starting in
$\Omega_s \times \Omega(x', y')$ approach the target set within a fixed time $T'_{\max}$ and stay in its vicinity for
an arbitrarily long time $T''(\delta_r, \delta)$. The latter time is a function of $\delta_r$, $\delta$: the smaller the values of
$\delta_r$, $\delta$, the larger the value of $T''(\delta_r, \delta)$.
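The revisiting argument can be checked numerically. The sketch below assumes the planar part of (64) in the matched case $\|s_i - s\|_\varepsilon = 0$, so that only the $\gamma\delta$ term acts: $\dot{x} = \gamma\delta\,(x - y - x(x^2 + y^2))$, $\dot{y} = \gamma\delta\,(x + y - y(x^2 + y^2))$. In polar coordinates this gives $\dot{r} = \gamma\delta\, r(1 - r^2)$ and a constant angular velocity $\gamma\delta$, hence one revolution every $2\pi/(\gamma\delta)$. The integrator, step size and parameter values are illustrative choices, not part of the original proof.

```python
import math


def rhs(x, y, gamma, delta):
    """Planar vector field of the perturbed prototype when the input is matched."""
    c = gamma * delta
    r2 = x * x + y * y
    return c * (x - y - x * r2), c * (x + y - y * r2)


def rk4_step(x, y, dt, gamma, delta):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y, gamma, delta)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, gamma, delta)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, gamma, delta)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, gamma, delta)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def simulate(x0, y0, gamma, delta, t_end, dt=1e-3):
    """Integrate up to t_end; return the final state and the total angle swept."""
    x, y = x0, y0
    swept = 0.0
    prev = math.atan2(y, x)
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt, gamma, delta)
        ang = math.atan2(y, x)
        # Unwrap the angle increment into [-pi, pi) before accumulating.
        swept += (ang - prev + math.pi) % (2.0 * math.pi) - math.pi
        prev = ang
    return x, y, swept


if __name__ == "__main__":
    gamma, delta = 1.0, 0.5
    period = 2.0 * math.pi / (gamma * delta)
    x, y, swept = simulate(1.0, 0.0, gamma, delta, period)
    print("radius after one period:", math.hypot(x, y))
    print("angle swept / (2*pi):", swept / (2.0 * math.pi))
```

Starting off the circle (for example from `(0.2, 0.0)`) shows the same rotation rate together with the radial attraction to $r = 1$.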

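4. Realizability. Let us finally show that system (21), (22) can be realized by a recurrent
neural network. More precisely, we wish to prove that there exists a system (8) such that
$\mathbf{x} = \boldsymbol{\zeta}_1 \oplus \boldsymbol{\zeta}_2 \oplus \cdots \oplus \boldsymbol{\zeta}_{N_f}$, $\boldsymbol{\zeta}_i \in \mathbb{R}^3$, $\boldsymbol{\zeta}_i = \zeta_{i,1} \oplus \zeta_{i,2} \oplus \zeta_{i,3}$, $i = \{1, \dots, N_f\}$ and solutions $\boldsymbol{\zeta}_i(t, \mathbf{q}_{i,0})$
are sufficiently close to $\mathbf{q}_i(t, \mathbf{q}_{i,0})$, where $\mathbf{q}_{i,0} \in \Omega_s \times \Omega(x', y') \subset \mathbb{R}^3$.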

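It is clear that the right-hand side of (21), (22) is a continuous and locally Lipschitz function.
To proceed further we use the following result by Cybenko (Cybenko, 1989):

Theorem 2 (Cybenko, 1989) Let $\sigma: \mathbb{R} \to \mathbb{R}$ be any continuous sigmoid-type function. Then
finite sums of the form

$$G(\boldsymbol{\zeta}) = \sum_{j=1}^{N} \alpha_j \sigma(\omega_j^T \boldsymbol{\zeta} + \beta_j), \quad \boldsymbol{\zeta} \in \mathbb{R}^m, \ \omega_j \in \mathbb{R}^m, \ \alpha_j, \beta_j \in \mathbb{R}$$

are dense in $C[0,1]^m$.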
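As a concrete illustration of what such finite sums look like, the sketch below builds one for a scalar continuous function on $[0,1]$. It uses a simple staircase construction (one steep logistic unit per grid cell), which is an assumption made here for illustration: Cybenko's proof is non-constructive, and nothing in the paper prescribes how the weights are chosen. The function names and the steepness rule are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_sum(f, n_terms, steepness=None):
    """Return G(x) = sum_j alpha_j * sigmoid(omega_j * x + beta_j) approximating f on [0, 1].

    One steep sigmoid is centred in each of n_terms grid cells and carries the
    increment of f across that cell; a saturated unit supplies the offset f(0).
    """
    if steepness is None:
        steepness = 20.0 * n_terms
    knots = [j / n_terms for j in range(n_terms + 1)]
    # Offset term: sigmoid(0 * x + 40) equals 1 to machine precision.
    alphas, omegas, betas = [f(knots[0])], [0.0], [40.0]
    for j in range(1, n_terms + 1):
        mid = 0.5 * (knots[j - 1] + knots[j])
        alphas.append(f(knots[j]) - f(knots[j - 1]))
        omegas.append(steepness)
        betas.append(-steepness * mid)

    def G(x):
        return sum(a * sigmoid(w * x + b) for a, w, b in zip(alphas, omegas, betas))

    return G


def sup_error(f, G, n_grid=2001):
    """Largest absolute error between f and G on a uniform grid over [0, 1]."""
    return max(abs(f(i / (n_grid - 1)) - G(i / (n_grid - 1))) for i in range(n_grid))


if __name__ == "__main__":
    target = lambda x: math.cos(2.0 * math.pi * x)
    for n in (25, 100, 400):
        print(n, sup_error(target, sigmoid_sum(target, n)))
```

In this construction the error is governed by the largest increment of the target over one grid cell, so it shrinks as the number of terms $N$ grows, in line with the density statement.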

According to Theorem 2, for any arbitrarily small $\varepsilon_N \in \mathbb{R}_{>0}$, any given bounded intervals
C R, ilj^ C R, and any 

$s(t)$, $\xi(t)$: $\max\{\|s(t)\|_{\infty,[0,\infty]}, \|\xi(t)\|_{\infty,[0,\infty]}\} \le M$, $M \in \mathbb{R}_{>0}$,
there exist N en, ujj e E^ aj e E, /J^- G M, j = 1, 2, . . . , AT such that

< £iv, (68) 

where e ® xQ-r x Q^^. It follows from (68) that there exist N, ujj, aj, Pj such that 

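$$\sum_{j=1}^{N} \alpha_j \sigma\left(\omega_j^T \cdot (\boldsymbol{\zeta}_i \oplus s(t) \oplus \xi(t)) + \beta_j\right) = \mathbf{f}(\boldsymbol{\zeta}_i, s(t), \xi(t)) + \gamma\delta \cdot \mathbf{g}(\boldsymbol{\zeta}_i) + \Delta(\boldsymbol{\zeta}_i, s(t), \xi(t)), \tag{69}$$

where $\Delta(\boldsymbol{\zeta}_i, s(t), \xi(t))$ is continuous and

$$|\Delta(\boldsymbol{\zeta}_i, s(t), \xi(t))| < \varepsilon_N.$$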

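Let us choose $\Omega_\zeta = [-v, v]$, where $v \in \mathbb{R}_{>0}$, $v > 1$, and consider the dynamics of

$$\dot{\boldsymbol{\zeta}}_i = \mathbf{f}(\boldsymbol{\zeta}_i, s(t), \xi(t)) + \gamma\delta \cdot \mathbf{g}(\boldsymbol{\zeta}_i) + \Delta(\boldsymbol{\zeta}_i, s(t), \xi(t)). \tag{70}$$

System (70) has a globally attracting invariant set (for almost all initial conditions) which can
be characterized as follows:

$$\{\boldsymbol{\zeta}_i \in \mathbb{R}^3 \mid 1 - \rho(\varepsilon_N) \le \zeta_{i,2}^2 + \zeta_{i,3}^2 \le 1 + \rho(\varepsilon_N)\}, \quad \rho \in \mathcal{K}_\infty.$$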

This follows immediately from the fact that (65) is structurally stable and has a globally
attracting invariant set (for almost all initial conditions). Furthermore, for any given $\varepsilon_N$ and
a bounded set of initial conditions $\Omega_\zeta(r) = \{\boldsymbol{\zeta}_i \in \mathbb{R}^3 \mid \|\boldsymbol{\zeta}_i\| \le r, \ r \in \mathbb{R}_{>0}\}$ there exists a constant
$B_1$ such that $\|\boldsymbol{\zeta}_i(t)\|_{\infty,[0,\infty]} \le B_1$. Hence solutions of system

$$\dot{\boldsymbol{\zeta}}_i = \sum_{j=1}^{N} \alpha_j \sigma\left(\omega_j^T \cdot (\boldsymbol{\zeta}_i \oplus s(t) \oplus \xi(t)) + \beta_j\right) \tag{71}$$

are bounded for all initial conditions from $\Omega_\zeta(r)$ provided that inequality (68) holds over
sufficiently large intervals $\Omega_\zeta$ (for sufficiently large $v$). Furthermore, given that $\varepsilon_N$ is sufficiently
small, solutions of (71) enter domain $\Omega_s \times \Omega(x',y')$ specified by (62) in finite time. Finally,
according to equality (69) and Theorem 3.4 in (Khalil, 2002), solutions of (71) starting in $\Omega(x', y')$
satisfy the following inequality:

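$$\|\mathbf{q}_i(t, \mathbf{q}_{i,0}) - \boldsymbol{\zeta}_i(t, \mathbf{q}_{i,0})\| \le \frac{\varepsilon_N}{L_f(D_0)}\left(e^{L_f(D_0)(t-t_0)} - 1\right), \quad \mathbf{q}_{i,0} \in \Omega_s \times \Omega(x', y'). \tag{72}$$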

Hence, for any $t > 0$, solutions of (71) starting from $\Omega_\zeta(r)$ approach the target set within a
fixed time (dependent on $\delta$) and stay in its vicinity arbitrarily long provided that $\delta$ and $\varepsilon_N$ are
sufficiently small. The possibility of the latter follows from Theorem 2.

Taking (72), (67), (23), (21) into account we conclude the proof by choosing $h_{f,i}(\mathbf{x}, s)$,
$h_{\theta,i}(\mathbf{x}, s)$ as follows:

$$\begin{aligned}
h_{f,i}(\mathbf{x}, s) &= h_{f,i}(\boldsymbol{\zeta}_1 \oplus \cdots \oplus \boldsymbol{\zeta}_{N_f}, s) = s - \zeta_{i,1}, \\
h_{\theta,i}(\mathbf{x}, s) &= h_{\theta,i}(\boldsymbol{\zeta}_1 \oplus \cdots \oplus \boldsymbol{\zeta}_{N_f}, s) = a + \frac{b - a}{2}(\zeta_{i,2} + 1).
\end{aligned} \tag{73}$$

The theorem is proven. 
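For readers who prefer code to formulas, the two read-out maps chosen at the end of the proof can be sketched as plain functions. The sketch assumes the reading of (73) in which $h_{f,i}$ is the mismatch between the input and the first state component, and $h_{\theta,i}$ maps the second state component from $[-1, 1]$ affinely onto the parameter interval $[a, b]$. The function names and the tuple layout `(zeta_1, zeta_2, zeta_3)` are illustrative assumptions.

```python
def h_f(zeta_i, s):
    """Mismatch read-out: input sample s minus the first component of the i-th block."""
    return s - zeta_i[0]


def h_theta(zeta_i, a, b):
    """Parameter read-out: affine map of the second component from [-1, 1] onto [a, b]."""
    return a + 0.5 * (b - a) * (zeta_i[1] + 1.0)


if __name__ == "__main__":
    # Hypothetical state of one three-dimensional block and a parameter range [2, 5].
    block = (0.3, 0.0, 1.0)
    print(h_f(block, s=0.3))          # mismatch vanishes when the block tracks the input
    print(h_theta(block, a=2.0, b=5.0))  # midpoint of [a, b] for zeta_2 = 0
```

Because the second and third components stay near the unit circle, $\zeta_{i,2}$ remains in $[-1, 1]$ up to a small error, so the parameter estimate stays in (a small neighborhood of) $[a, b]$.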

Before concluding this section we would like to provide several remarks regarding Theorem 1. 

Remark 1 (Read-out from the outputs) As follows from the theorem, the class of signal
$s(t) = s_i(t, s_{i,0}, \theta_i, \eta_i(t))$, i.e. the index $i$, can be inferred from the values of $h_{f,j}(\mathbf{x}(t), s(t))$,
$j = \{1, \dots, N_f\}$ within a finite interval of time. The values of $h_{f,i}(\mathbf{x}(t), s(t))$ should approach
a small neighborhood of zero and stay there for a sufficiently long time. The estimate of $\theta_i$ up
to its equivalence class is available from the values of $h_{\theta,i}(\mathbf{x}(t), s(t))$ over the same interval.
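The decision rule described in this remark can be sketched as a dwell-time test on sampled read-outs. Everything specific here is an assumption made for illustration: the uniform sampling step `dt`, the tolerance `tol` defining the "small neighborhood of zero", and the required dwell time `dwell` are not fixed by the theorem.

```python
import math


def classify(h_f_samples, dt, tol, dwell):
    """Return (class index, detection time) or None.

    h_f_samples[j] holds samples of the j-th mismatch read-out taken every dt.
    A class is detected once |h_f| has stayed below tol for at least `dwell`
    time units; among several candidates the earliest detection wins.
    """
    need = max(1, int(math.ceil(dwell / dt - 1e-9)))  # samples required in a row
    best = None
    for j, samples in enumerate(h_f_samples):
        run = 0
        for k, value in enumerate(samples):
            run = run + 1 if abs(value) < tol else 0
            if run >= need:
                t_detect = (k + 1) * dt
                if best is None or t_detect < best[1]:
                    best = (j, t_detect)
                break
    return best


if __name__ == "__main__":
    # Hypothetical read-outs: channel 0 never settles, channel 1 settles after 2 time units.
    channel_0 = [0.5] * 100
    channel_1 = [1.0] * 20 + [0.0] * 80
    print(classify([channel_0, channel_1], dt=0.1, tol=0.05, dwell=2.0))
```

Since the theorem guarantees only a long, not an infinite, stay near zero, a finite `dwell` is the natural acceptance criterion; the parameter estimate would then be read from the matching $h_{\theta,i}$ over the same window.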

From a practical viewpoint, however, it is preferable to read out from the RNN outputs
explicitly, rather than having to satisfy ourselves with the existence of two sets of read-out
functions, for state and input, respectively, of the RNN. Even though this option is not stated
explicitly in Theorem 1, it can be easily shown that the preferred option can, indeed, be realized.
Adding to recurrent subsystem (8) a feed-forward one realizing continuous "output" functions
(73) enables explicit read-out from the RNN outputs.

Remark 2 (Convergence to an attractor) Theorem 1 does not imply that recognition of
a class of the input signal $s(t)$ involves convergence of the RNN state to an attractor. Yet its
formulation does not exclude this option either. In fact, when $f_i(\xi(t), \theta_i)$ satisfies some
additional restrictions (e.g. linear or monotone parametrization with respect to $\theta_i$), it is possible
to replace (17), (18) with another prototype system: one that converges to a point attractor
exponentially (Tyukin et al., 2007). This implies that whether the state of a network will behave
intermittently or asymptotically converge to an attractor depends substantially on the properties
of $f_i(\xi(t), \theta_i)$. It is important, however, that in both cases the recognition problem
will be successfully solved by an RNN.

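Remark 3 (Multidimensional uncertainty) Even though the theorem applies to the case
where $\theta_i$ is a scalar, it can be trivially extended to the case where uncertain parameters are
vectors from a bounded domain $\Omega_{\theta,d} \subset \mathbb{R}^d$. To do so one needs to find a Lipschitz mapping
$\lambda: \mathbb{R} \to \mathbb{R}^d$ such that for a given small $\varepsilon_\lambda \in \mathbb{R}_{>0}$ the following property holds:

$$\forall\, \boldsymbol{\theta}_i \in \Omega_{\theta,d} \ \ \exists\, \theta_i \in \Omega_\theta: \ \|\boldsymbol{\theta}_i - \lambda(\theta_i)\| < \varepsilon_\lambda.$$

Hence the problem will reduce to the scalar case to which Theorem 1 applies.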
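One simple way to obtain such a mapping, sketched here for $d = 2$ and the unit square, is a zig-zag (boustrophedon) path whose horizontal rows are at most $2\varepsilon$ apart. The construction, the restriction of the scalar argument to $[0, 1]$, and all names below are illustrative assumptions; the remark itself does not prescribe a particular $\lambda$.

```python
import math


def make_zigzag(eps):
    """Return (lam, lipschitz) where lam maps [0, 1] onto a zig-zag path in [0, 1]^2.

    Rows are spaced h <= 2*eps apart, so every point of the square lies within
    eps of the path. The path is parametrised proportionally to arc length,
    which makes its total length a Lipschitz constant of lam.
    """
    n_gaps = max(1, math.ceil(1.0 / (2.0 * eps)))
    h = 1.0 / n_gaps
    length = (n_gaps + 1) + n_gaps * h   # horizontal runs plus vertical hops
    cell = 1.0 + h                       # one run followed by one hop

    def lam(v):
        s = min(max(v, 0.0), 1.0) * length
        k = min(int(s // cell), n_gaps)  # index of the current row
        r = s - k * cell                 # arc length travelled inside this cell
        if r <= 1.0:                     # on the horizontal run
            x = r if k % 2 == 0 else 1.0 - r
            y = k * h
        else:                            # on the vertical hop to the next row
            x = 1.0 if k % 2 == 0 else 0.0
            y = k * h + (r - 1.0)
        return (x, y)

    return lam, length


if __name__ == "__main__":
    lam, lipschitz = make_zigzag(0.1)
    print("Lipschitz constant (path length):", lipschitz)
    print("lam(0), lam(0.5), lam(1):", lam(0.0), lam(0.5), lam(1.0))
```

The price of a finer tolerance is a larger Lipschitz constant (roughly $1/(2\varepsilon)$ here), so the scalar problem obtained after the reduction becomes correspondingly stiffer.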
5 Conclusion



We provided a theoretical answer to the important question of why an RNN with fixed
weights can serve as a universal adaptive classifier of both static and dynamic inputs. In
addition to providing an existence proof we have proven that the number of dynamical states
in an RNN recognizing $n$ different signals $s_i(t)$ can be as small as $3n$, i.e. it grows linearly with
the size of the set of uncertain signals to be classified.

We stated the classification and recognition problems in a behavioral context in which, over 
time, the desired input-output relationship is achieved. Finding a solution corresponds to a 
network dynamics in which the state reaches a given neighborhood of the a-priori specified set 
and stays there for a sufficiently long time, provided that the input to the network belongs to a given
class (Problem 1). With these ramifications, RNNs solve the problem of adaptively classifying
time-dependent signals. We did not set out to guarantee, however, that the state of the RNN 
will asymptotically converge to an equilibrium or its small vicinity as a result of recognition. 
On the other hand the amount of time a network would spend in the vicinity of a target set 
can be made sufficiently large to qualify as a practical solution to the classification problem. 
For classification, after all, asymptotic convergence is not needed. 

In physics and nonlinear dynamics the phenomenon in which the state of a system reaches a
neighborhood of a set and stays there sufficiently long, yet inevitably escapes, only to get
caught again, is called (chaotic) itinerancy (Kaneko & Tsuda, 2003); the set is referred to as an
attractor-ruin. These descriptive concepts are currently recognized as a possible mathematical
basis for modeling brain activity (Tsuda, 1991; Tsuda & Fujii, 2004). We envisage that our 
current result supports this idea, by showing the considerable power of these systems to perform 
adaptive classification. 